The Relation of Closed Itemset Mining, Complete Pruning Strategies and Item Ordering in Apriori-Based FIM Algorithms

Authors

  • Ferenc Bodon
  • Lars Schmidt-Thieme
Abstract

In this paper we investigate the relationship between closed itemset mining, the complete pruning technique, and item ordering in the Apriori algorithm. We claim that when a proper item order is used, complete pruning does not necessarily speed up Apriori, and that in databases with certain characteristics pruning increases run time significantly. We also show that if complete pruning is applied, an intersection-based technique not only results in a faster algorithm but also yields closed-itemset selection for free with respect to both memory consumption and run time. The theoretical claims are supported by results from a comprehensive set of experiments, involving hundreds of tests on numerous databases with different support thresholds.
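To make the terms in the abstract concrete, the following is a minimal sketch of level-wise Apriori with an optional complete pruning step, followed by a naive closed-itemset filter. It is an illustration under stated assumptions, not the authors' implementation: the function names `apriori` and `closed_itemsets` and the `complete_pruning` flag are hypothetical, the subset check is a plain lookup rather than the intersection-based technique the paper describes, and items are assumed to be mutually comparable so they can be sorted into a fixed item order.

```python
from itertools import combinations


def apriori(transactions, min_support, complete_pruning=True):
    """Return {frozenset: support} for all frequent itemsets (illustrative sketch)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: merge two frequent (k-1)-itemsets that share a (k-2)-prefix
        # under the chosen item order (here: plain sorted order).
        prev = sorted(tuple(sorted(s)) for s in frequent)
        candidates = set()
        for i, a in enumerate(prev):
            for b in prev[i + 1:]:
                if a[:-1] != b[:-1]:
                    break  # sorted order keeps equal prefixes contiguous
                cand = frozenset(a + b[-1:])
                # Complete pruning: drop the candidate unless every
                # (k-1)-subset is frequent.
                if complete_pruning and any(
                    frozenset(sub) not in frequent
                    for sub in combinations(cand, k - 1)
                ):
                    continue
                candidates.add(cand)

        # Support counting for the surviving candidates.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


def closed_itemsets(frequent):
    """Keep itemsets that have no proper superset with the same support."""
    return {
        s: sup
        for s, sup in frequent.items()
        if not any(s < t and sup == tsup for t, tsup in frequent.items())
    }


# Hypothetical toy database.
tx = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"c"}]
full = apriori(tx, min_support=2)
lazy = apriori(tx, min_support=2, complete_pruning=False)
closed = closed_itemsets(full)
```

Both calls return the same frequent itemsets; disabling complete pruning only changes how many candidates reach the support-counting step, which is the trade-off the abstract discusses. The closed-itemset filter here is a quadratic post-processing pass, unlike the free selection the paper claims for its intersection-based approach.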

Similar resources

Efficient Mining of Association Rules using Closed

Discovering association rules is one of the most important tasks in data mining. Many efficient algorithms have been proposed in the literature. The most noticeable are Apriori, Mannila's algorithm, Partition, Sampling and DIC, which are all based on the Apriori mining method: pruning the subset lattice (itemset lattice). In this paper we propose an efficient algorithm, called Close, based on a new...

Accelerating Parallel Frequent Itemset Mining on Graphics Processors with Sorting

Frequent Itemset Mining (FIM) is one of the most investigated fields of data mining. The goal of FIM is to find the most frequently occurring subsets from the transactions within a database. Many methods have been proposed to solve this problem, and the Apriori algorithm is one of the best-known methods for FIM in a transactional database. In ...

WFIM: Weighted Frequent Itemset Mining with a weight range and a minimum weight

Researchers have proposed weighted frequent itemset mining algorithms that reflect the importance of items. The main focus of weighted frequent itemset mining concerns satisfying the downward closure property. All weighted association rule mining algorithms suggested so far have been based on the Apriori algorithm. However, pattern growth algorithms are more efficient than Apriori based algorit...

A Probability Analysis for Frequent Itemset Mining Algorithms

Since the introduction of the Frequent Itemset Mining (FIM) problem, several different algorithms for solving it were proposed and experimentally analyzed. Our work focusses on the theoretical analysis of FIM. The aim is to give a detailed probabilistic study of the performance of FIM algorithms for different data distributions. It is joint work with Dirk Van Gucht and Paul Purdom from Indiana ...

A New Algorithm for High Average-utility Itemset Mining

High utility itemset mining (HUIM) is an emerging field in data mining that has gained growing interest due to its various applications. The goal of this problem is to discover all itemsets whose utility exceeds a minimum threshold. The basic HUIM problem does not consider the length of itemsets in its utility measurement, and utility values tend to become higher for itemsets containing more items...

Publication date: 2005